Learning Reactive Policies for Probabilistic Planning Domains

Authors

SungWook Yoon

Robert Givan

Abstract

We present a planning system for selecting policies in probabilistic planning domains. Our system is based on a variant of approximate policy iteration that combines inductive machine learning and simulation to perform policy improvement. Given a planning domain, the system iteratively improves the best policy found so far until no more improvement is observed or a time limit is exceeded. Though this process can be computationally intensive, the result is a reactive policy, which can then be used to quickly solve future problem instances from the planning domain. In this way, the resulting policy can be viewed as a domain-specific reactive planner for the planning domain, though it is discovered with a domain-independent technique. Thus, the initial cost of finding the policy is amortized over future problem-solving experience in the domain. Due to the system’s inductive nature, there are no performance guarantees for the selected policies. However, empirically our system has shown state-of-the-art performance in a number of benchmark planning domains, both deterministic and stochastic.
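The improvement loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy chain domain, the names `step`, `rollout_cost`, `q_estimate`, `learn_policy`, and `approximate_policy_iteration`, the rollout width and horizon, and the fixed iteration budget that stands in for the "no more improvement or time limit" stopping rule. The learner is a plain lookup table, used only as a stand-in for whatever inductive learner the system actually employs.

```python
import random

N = 6                        # goal state of a hypothetical toy chain domain
ACTIONS = ("left", "right")

def step(state, action, rng):
    """Stochastic simulator: the chosen move succeeds with probability 0.8."""
    if rng.random() < 0.8:
        state += 1 if action == "right" else -1
    return max(0, min(N, state))

def rollout_cost(state, first_action, policy, rng, horizon=30):
    """Steps needed to reach the goal when taking first_action, then following policy."""
    action = first_action
    for t in range(horizon):
        if state == N:
            return t
        state = step(state, action, rng)
        action = policy(state)
    return horizon           # goal not reached within the horizon

def q_estimate(state, action, policy, rng, width=100):
    """Average cost over `width` simulated trajectories."""
    total = sum(rollout_cost(state, action, policy, rng) for _ in range(width))
    return total / width

def learn_policy(examples, fallback):
    """Stand-in learner: memorise the labelled action for each training state."""
    table = dict(examples)
    return lambda s: table[s] if s in table else fallback(s)

def approximate_policy_iteration(iterations=8, seed=0):
    rng = random.Random(seed)
    policy = lambda s: "left"            # deliberately poor initial policy
    for _ in range(iterations):
        examples = []
        for s in range(N):               # the toy domain is small enough to enumerate
            best = min(ACTIONS, key=lambda a: q_estimate(s, a, policy, rng))
            examples.append((s, best))   # training label: estimated best action
        policy = learn_policy(examples, policy)
    return policy
```

In this sketch, simulation supplies the training labels (the action with the lowest estimated cost in each state) and the learner turns those labels into the next policy. A realistic system would sample states instead of enumerating them and would learn a policy that generalises across problem instances, which a lookup table cannot do.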

Similar resources

Policy-Gradient Methods for Planning

Probabilistic temporal planning attempts to find good policies for acting in domains with concurrent durative tasks, multiple uncertain outcomes, and limited resources. These domains are typically modelled as Markov decision problems and solved using dynamic programming methods. This paper demonstrates the application of reinforcement learning — in the form of a policy-gradient method — to thes...

Discrepancy Search with Reactive Policies for Planning

We consider a novel use of mostly-correct reactive policies. In classical planning, reactive policy learning approaches could find good policies from solved trajectories of small problems and such policies have been successfully applied to larger problems of the target domains. Often, due to the inductive nature, the learned reactive policies are mostly correct but commit errors on some portion...

Learning to Plan Probabilistically

This paper discusses the learning of probabilistic planning without a priori domain-specific knowledge. Different from existing reinforcement learning algorithms that generate only reactive policies and existing probabilistic planning algorithms that require a substantial amount of a priori knowledge in order to plan, we devise a two-stage bottom-up learning-to-plan process, in which first rei...

Action Schema Networks: Generalised Policies with Deep Learning

In this paper, we introduce the Action Schema Network (ASNet): a neural network architecture for learning generalised policies for probabilistic planning problems. By mimicking the relational structure of planning problems, ASNets are able to adopt a weight sharing scheme which allows the network to be applied to any problem from a given planning domain. This allows the cost of training the net...

Probabilistic Programming for Planning Problems

Probabilistic programming is an emerging field at the intersection of statistical learning and programming languages. An appealing property of probabilistic programming languages (PPL) is their support for constructing arbitrary probability distributions. This allows one to model many different domains and solve a variety of problems. We show the link between probabilistic planning and PPLs by i...

Publication date: 2004
